Hiding Miss Latencies with Multithreading on the Data Diffusion Machine
Abstract
Large parallel computers require techniques to tolerate the potentially large latencies of accessing remote data. Multithreading is one such technique. We extend previous studies of multithreading by investigating its use on the Data Diffusion Machine (DDM), a virtual shared memory machine in which data migrates according to its use. We use a detailed emulator to study DDMs with up to 72 nodes, which allows the scalability of multithreading to be tested further than in other studies. The results are promising and show that the applications tested can all benefit from multithreading on the DDM. Most applications, however, reach the ceiling of their parallelism. We briefly discuss how the results may generalise to other architectures.
Similar resources
The Synergy of Multithreading and Access/Execute Decoupling
This work presents and evaluates a novel processor microarchitecture which combines two paradigms: access/execute decoupling and simultaneous multithreading. We investigate how both techniques complement each other: while decoupling features an excellent memory latency hiding efficiency, multithreading supplies the in-order issue stage with enough ILP to hide the functional unit latencies. Its...
Improving Latency Tolerance of Multithreading through Decoupling
The increasing hardware complexity of dynamically scheduled superscalar processors may compromise the scalability of this organization to make an efficient use of future increases in transistor budget. SMT processors, designed over a superscalar core, are therefore directly concerned by this problem. This work presents and evaluates a novel processor microarchitecture which combines two paradi...
Characterizing the Sort Operation on Multithreaded Architectures
The Sort operation is a core part of many critical applications. Despite the large efforts to parallelize it, the fact that it suffers from high data-dependencies vastly limits its performance. Multithreaded architectures are emerging as the most demanding technology in leading-edge processors. These architectures include Simultaneous Multithreading, Chip Multiprocessors and machines combining ...
Evaluating the Performance of Multithreading and Prefetching in Multiprocessors
This paper presents new analytical models of the performance benefits of multithreading and prefetching, and experimental measurements of parallel applications on the MIT Alewife multiprocessor. For the first time, both techniques are evaluated on a real machine as opposed to simulations. The models determine the region in the parameter space where the techniques are most effective, while the m...
ADAM: a decentralized parallel computer architecture featuring fast thread and data migration and a uniform hardware abstraction
The furious pace of Moore’s Law is driving computer architecture into a realm where the speed of light is the dominant factor in system latencies. The number of clock cycles to span a chip are increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by mig...
Publication date: 1995